ABSTRACT

Polμ is the only DNA polymerase equipped with both
template-directed and terminal transferase
activities. Polμ is also able to accept distortions in
both primer and template strands, resulting in
misinsertions and extension of realigned mismatched
primer termini. In this study, we
propose a model for human Polμ-mediated dinucleotide
expansion as a function of the
sequence context. In this model, Polμ requires an
initial dislocation, which must be subsequently
stabilized, to generate large sequence expansions
on different 5'-P-containing DNA substrates,
including those that mimic non-homologous end-joining
(NHEJ) intermediates. Our mechanistic
studies point to human Polμ residues His329 and
Arg387 as responsible for regulating nucleotide expansions
occurring during DNA repair transactions,
either promoting or blocking, respectively, iterative
polymerization. This is reminiscent of the role of
both residues in the mechanism of terminal transferase
activity. The iterative synthesis performed by
Polμ in various contexts may lead to frameshift
mutations producing DNA damage and instability,
which may end in different human disorders,
including cancer or congenital abnormalities.

INTRODUCTION 

Maintaining the integrity of the DNA sequence is essential
for all living cells; it is achieved largely by maximizing
fidelity during DNA replication and by accurately
repairing damaged DNA (1). These processes require a
large number of proteins, including specialized DNA polymerases.
DNA polymerases have been categorized into
four different groups according to their biochemical
properties and the biological processes in which they
are involved. They are grouped by primary sequence
homology into families A, B, Y and X. Among them, only
X family DNA polymerases (PolX) are devoted to DNA
repair, being evolutionarily conserved in prokaryotes,
eukaryotes and archaea (2). DNA polymerases of the
X family, which in mammals include DNA polymerase
beta (Polβ), lambda (Polλ), mu (Polμ) and terminal
deoxynucleotidyl transferase (TdT), are structurally
related enzymes specialized in DNA repair pathways
involving gaps and double-strand breaks (DSBs)
[reviewed in (3)].

Human Polμ, consisting of 494 amino acids, has 41%
identity to TdT, its closest homologue in the family. The
structural similarities with TdT include a nuclear localization
signal at the N-terminus, followed by a BRCT
domain and the conserved Polβ core (4). Regarding its
biochemical properties, Polμ displays both terminal transferase
and DNA-dependent DNA polymerization
activities (4,5). The strong enhancement of these two
activities by manganese ions, and the lack of proofreading,
make Polμ a low-fidelity polymerase with a strong
mutator behaviour. This mutator activity is further
enhanced by its lack of sugar discrimination, allowing
the use of both dNTPs and rNTPs (6,7). Moreover, two
hallmarks of Polμ activity are: (i) the capacity to induce or
accept template distortions, in order to realign imperfectly
paired DNA primers (8,9); and (ii) the capacity to bridge
DNA ends with minimal or no complementarity,
contributing to the efficiency of end-joining by performing
either templated or untemplated insertions at the 3'
termini (10,11). Likewise, Polμ is able to perform DNA
synthesis despite the presence of mismatched nucleotides
near the primer terminus (12). The combination of all
these properties makes Polμ well suited for a role in the
non-homologous end-joining (NHEJ) DNA repair mechanism,
as was proposed early on (5,9) and is strongly
supported by the demonstration of direct interactions of
Polμ with NHEJ factors (13-15), and by analysis of Polμ
deficiency in various in vivo systems (13,16-20).

On the other hand, this enzymatic versatility regarding
the use of, and interactions with, DNA and nucleotide
substrates is the basis for the high
misincorporation rate of Polμ, one of the most unfaithful
polymerases known in higher eukaryotes (4). The strong
mutator activity of Polμ when copying a DNA template
resides in its ability to create or accept distortions and
realignments of both primer and template strands, being
largely dependent on sequence context (8,9). Thus, it was
found that many of the alleged misincorporations by Polμ,
including -1 nt frameshift errors, were the result of base
pairing between the incoming nucleotide and complementary
positions close to the templating base (9,21). Analysis
of different sequence contexts showed that the most
extreme mutator behaviour of Polμ corresponded to a
'slippage' mechanism [(8); see Supplementary Figure S1],
described initially for Polβ on a template/primer (22). This
slippage mechanism has been shown to be the cause of
terminal transferase additions by Polμ in the context of
heavily damaged DNA, such as substrates containing
AAF adducts or abasic sites (12,23).

Tandem DNA repeats are mutational hotspots in the 
genome which undergo frequent length changes due to 
insertions (usually referred to as expansions) or deletions 
of repeat units. Variations in length of some repeats can 
produce phenotypic differences that are known to cause a 
large number of diseases (24,25). During the past decade 
several groups have demonstrated that the molecular 
mechanisms of repeat expansion or deletion are 
mediated by DNA replication, repair and recombination 
machineries (25-28). Methyl-directed mismatch repair 
(MMR) (29,30), nucleotide excision repair, base excision 
repair (31-33), transcription and DNA polymerase proof- 
reading have been described to be involved in the molecu- 
lar mechanisms of these unstable sequences (24,34), 
among other genetic factors. Moreover, the non- 
conventional DNA conformation of the repeated 
sequences could increase DNA polymerase stalling, 
facilitating strand slippage and producing mutagenesis 
and associated diseases (27). 

In this work we describe the contribution of a human
DNA repair polymerase, Polμ, to the expansion of dinucleotides,
which occur consistently throughout the whole
genome, as a side effect of its role during NHEJ repair
of double-strand breaks in DNA.



MATERIALS AND METHODS 

DNA and proteins 

Unlabelled ultrapure dNTPs and [γ-32P]ATP (3000 Ci/mmol)
were purchased from Amersham Pharmacia
Biotech. Synthetic DNA oligonucleotides were obtained
from Invitrogen: D (5'-ACGACGGCCAGT) was used as
downstream oligonucleotide;
3'-(GAC)2 (5'-ACTGGCCGTCGTCAGCAGGTACTCACTGTGATC),
3'-(GTC)2 (5'-ACTGGCCGTCGTCTGCTGGTACTCACTGTGATC),
3'-(AAG)2 (5'-ACTGGCCGTCGTGAAGAAGTACTCACTGTGATC),
3'-(CTT)2 (5'-ACTGGCCGTCGTTTCTTCGTACTCACTGTGATC),
3'-(CAG)2 (5'-ACTGGCCGTCGTGACGACGTACTCACTGTGATC),
3'-(GAA)2 (5'-ACTGGCCGTCGTAAGAAGGTACTCACTGTGATC),
3'-GAA (5'-ACTGGCCGTCGTAAGGTACTCACTGTGATC),
3'-AAA (5'-ACTGGCCGTCGTAAAGTACTCACTGTGATC),
3'-CAA(1) (5'-ACTGGCCGTCGTAACGTACTCACTGTGATC),
3'-CAA(2) (5'-ACTGGCCGTCGTAACCTACTCACTGTGATC),
3'-TAA (5'-ACTGGCCGTCGTAATGTACTCACTGTGATC),
3'-CTT (5'-ACTGGCCGTCGTTTCCTACTCACTGTGATC),
3'-CCC (5'-ACTGGCCGTCGTCCCCTACTCACTGTGATC) and
3'-CGG (5'-ACTGGCCGTCGTGGCCTACTCACTGTGATC) were
used as templates; P1 (5'-GATCACAGTGAGTAC),
P2 (5'-GATCACAGTGAGTAG), P3 (5'-GATCACAGTGAGTACCC)
and P4 (5'-GATCACAGTGAGTACCCC) were
used as primers. Oligonucleotide D contains a phosphate
at its 5'-end. DNA oligonucleotides used as primers (P1 and
P2) were labelled at their 5'-ends with [γ-32P]ATP and T4
polynucleotide kinase. To construct the different gapped
molecules used in the polymerization assays, P1 was
simultaneously hybridized to the downstream oligonucleotide
D and one of the following template oligonucleotides:
3'-(GAC)2, 3'-(GTC)2, 3'-(AAG)2, 3'-(CTT)2, 3'-(CAG)2,
3'-(GAA)2, 3'-(GAA)1, 3'-(AAA)1, 3'-(CAA)1 or 3'-(TAA)1;
P2 was simultaneously hybridized to D and one of the
following template oligonucleotides: 3'-CAA, 3'-CTT, 3'-CCC
or 3'-CGG; P3 and P4 were hybridized to D and 3'-CGG.
For NHEJ assays, oligonucleotides D3 (5'-CCCTCCCTCCGCGGC),
D3BB1 (5'-CCCTCCCTCCGCGGAC), D3BB2 (5'-CCCTCCCTCCGCGAGC),
D3T (5'-CCCTCCCTCCGCGGT), D3TT (5'-CCCTCCCTCCGCGGTT) and
D3G (5'-CCCTCCCTCCGCGGG) were used as labelled
primers, hybridized to oligonucleotide 5'-GGGAGGGAGGC,
while oligonucleotides D4 (5'-CGCGCACTCACGTCCCGGCC)
or D4-AA (5'-CGCGCACTCACGTCCCAACC)
were hybridized with the 5'-phosphate-containing D1
(5'-GGGACGTGAGTGCGCG) to form a template/downstream
substrate. Hybridizations were performed in the
presence of 50 mM Tris-HCl pH 7.5 and 0.3 M NaCl. All
the oligonucleotides described above were purified by
electrophoresis in 8 M urea/20% polyacrylamide gels. T4
polynucleotide kinase and T4 DNA ligase were from New
England Biolabs. Both highly purified wild-type human
Polμ and wild-type human Polλ were obtained as previously
described (4,35). Purification of mutants H329G, R387K
and R387A was performed as described (10).
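
As a consistency check on the substrate design, the gap sequence of each molecule can be recomputed from the oligonucleotide sequences listed above. The following Python sketch is illustrative only and is not part of the original methods: the helper names (`reverse_complement`, `gap_3_to_5`) are hypothetical, and it assumes that the primer anneals to the 3' end of the template and oligonucleotide D to its 5' end, leaving the single-stranded gap between them.

```python
# Illustrative sketch (not part of the original methods): recover the
# single-stranded gap of a substrate from the oligonucleotide sequences.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gap_3_to_5(template: str, primer: str, downstream: str) -> str:
    """Return the template gap read 3'->5' (the order used in the text).

    Assumption: the downstream oligonucleotide pairs with the 5' end of
    the template and the primer pairs with its 3' end.
    """
    downstream_site = reverse_complement(downstream)
    primer_site = reverse_complement(primer)
    if not template.startswith(downstream_site):
        raise ValueError("downstream oligonucleotide does not anneal")
    if not template.endswith(primer_site):
        raise ValueError("primer does not anneal")
    gap_5_to_3 = template[len(downstream_site):len(template) - len(primer_site)]
    return gap_5_to_3[::-1]


D = "ACGACGGCCAGT"
P1 = "GATCACAGTGAGTAC"
P2 = "GATCACAGTGAGTAG"
TEMPLATE_CTT2 = "ACTGGCCGTCGTTTCTTCGTACTCACTGTGATC"  # 3'-(CTT)2
TEMPLATE_GAA = "ACTGGCCGTCGTAAGGTACTCACTGTGATC"      # 3'-GAA
TEMPLATE_CCC = "ACTGGCCGTCGTCCCCTACTCACTGTGATC"      # 3'-CCC (used with P2)

print(gap_3_to_5(TEMPLATE_CTT2, P1, D))  # CTTCTT
print(gap_3_to_5(TEMPLATE_GAA, P1, D))   # GAA
print(gap_3_to_5(TEMPLATE_CCC, P2, D))   # CCC
```

Under these assumptions, the recovered gaps match the substrate names, which is a quick way to catch transcription errors in the sequences.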

Purification of human Polβ

Escherichia coli cells expressing human Polβ were ground
with alumina, and the resulting lysate was centrifuged and
cleaned by precipitation with ammonium sulphate. The
precipitate was subjected to phosphocellulose chromatography
followed by HiTrap Heparin (Pharmacia Biotech)
chromatography. The column eluate was loaded onto a glycerol
gradient and centrifuged at 62,000 rpm for 24 h, and fractions
were examined in Coomassie Blue-stained gels and
tested for DNA polymerase activity on activated (DNase I-treated)
DNA.

DNA polymerization assays 

Different DNA substrates containing 5'-P labelled primers
(described above) were incubated with the indicated
amounts of wild-type or mutant human Polμ,
human Polλ or human Polβ, as specified in each case. The reaction
mixture (20 µl) contained 50 mM Tris-HCl pH 7.5,
1 mM DTT, 4% glycerol and 0.1 mg/ml BSA, in the
presence of 4 nM of the indicated DNA polymerization
substrate, 50 µM of the dNTP or the mix of dNTPs
indicated in each case, 2 mM MgCl2 and amounts of
DNA polymerase as indicated in the figure legends.
After the incubation indicated in each case, reactions
were stopped by adding gel loading buffer [95% (v/v)
formamide, 10 mM EDTA, 0.1% (w/v) xylene cyanol
and 0.1% (w/v) bromophenol blue]. Samples were
analysed by 8 M urea/20% PAGE and either autoradiography
or Phosphorimager scanning.
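
For orientation, the absolute amounts present in a standard 20 µl reaction follow directly from the concentrations given above. The short Python sketch below is simple illustrative arithmetic; the function names are hypothetical, and the 270 nM polymerase concentration is the value quoted in the figure legends rather than in this section.

```python
# Illustrative arithmetic only; helper names are hypothetical.

def femtomoles(conc_nM, volume_ul):
    """nM x µl = fmol (1e-9 mol/l x 1e-6 l = 1e-15 mol)."""
    return conc_nM * volume_ul


def picomoles(conc_uM, volume_ul):
    """µM x µl = pmol (1e-6 mol/l x 1e-6 l = 1e-12 mol)."""
    return conc_uM * volume_ul


VOLUME_UL = 20                             # reaction volume
dna_fmol = femtomoles(4, VOLUME_UL)        # DNA substrate at 4 nM
enzyme_fmol = femtomoles(270, VOLUME_UL)   # polymerase at 270 nM (figure legends)
dntp_pmol = picomoles(50, VOLUME_UL)       # dNTP at 50 µM

print(dna_fmol)                # 80
print(enzyme_fmol)             # 5400
print(dntp_pmol)               # 1000
print(enzyme_fmol / dna_fmol)  # 67.5
```

Under these conditions the polymerase is in large molar excess over the DNA substrate (about 67-fold), and the nucleotide is in vast excess over both.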

3D-modeling 

The different conformations of selected residues of Polμ
were analysed using the Swiss PDB-Viewer
(http://www.expasy.ch/spdbv/) and MacPyMOL (The PyMOL
Molecular Graphics System, Schrödinger, LLC,
http://www.pymol.org/).



RESULTS 

Polμ, but neither Polλ nor Polβ, produces large
sequence expansions when a specific repeated
trinucleotide sequence is used as a template

Polμ can act as a mutator polymerase, based on its ability
to realign and dislocate DNA chains during polymerization,
whereas Polλ and Polβ, which belong to the same family
as Polμ, are not so versatile and promiscuous in the use of
DNA substrates (36). All three polymerases have a
marked preference for 5'-P gapped molecules, as the
8 kDa domain strongly interacts with this group at the
5'-end of the downstream chain (35,37,38), improving
binding stability. Thus, we asked whether any of these
DNA polymerases could generate triplet repeat expansions
in the context of a gap. According to the substrate
preferences of these polymerases, four different DNA substrates
containing a 6-nt gap were designed, with different
trinucleotide sequences as templates, each one repeated
twice (see scheme in Figure 1). Expansions of triplet repeats
of the sequence 3'-GAC (or its complementary
chain 3'-GTC) are associated with Huntington's disease;
triplet repeat expansions of 3'-AAG (or its complementary
sequence 3'-CTT) are related to Friedreich's ataxia.
Polymerization reactions were assayed on these molecules,
in the presence of either Polμ, Polλ or Polβ, and the complementary
dNTPs needed in each case (Figure 1).
In general, Polλ and Polβ filled the gap and then polymerization
stopped (+6 product). In the case of 3'-(GTC)2,
Polβ displayed a significant strand-displacement capacity
and efficiently continued adding nucleotides after filling
the gap (+18 product). This imprecise gap-filling by
Polβ has already been described (39). Interestingly, Polμ
was eventually able to continue polymerization, producing
extra additions that could also originate via strand
displacement. However, this behaviour is most remarkable
in the case of 3'-(CTT)2, where Polμ fills the gap and
then efficiently generates a very large DNA expansion as
a final product. The main difference between this sequence
and the others is the presence of a dinucleotide (TT) in the
template strand at the end of the gap.
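
The product labels used for these gels can be related to the oligonucleotide lengths. The sketch below is a simple bookkeeping aid, assuming the 15-nt primer P1, a 6-nt gap and the 12-nt downstream oligonucleotide D described in 'Materials and Methods'; the function name is hypothetical.

```python
# Illustrative bookkeeping of expected product sizes on a gapped substrate.

def expected_products(primer_len: int, gap_len: int, downstream_len: int) -> dict:
    """Map gel labels to product lengths (nt) for templated synthesis only."""
    gap_filled = primer_len + gap_len          # primer extended to fill the gap
    displaced = gap_filled + downstream_len    # plus full strand displacement
    return {
        "P": primer_len,
        f"+{gap_len}": gap_filled,
        f"+{gap_len + downstream_len}": displaced,
    }


products = expected_products(primer_len=15, gap_len=6, downstream_len=12)
print(products)  # {'P': 15, '+6': 21, '+18': 33}
```

Under these assumptions the +18 product (33 nt) already spans the whole 33-nt template, so anything longer cannot arise from a single templated pass and is scored as sequence expansion.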



Polμ requires a dinucleotide at the end of the template
sequence to produce large sequence expansions

Taking into account the dislocation capacity of Polμ,
strongly driven by the presence of dinucleotides in the
template strand, we wanted to confirm whether the presence of
a repeated nucleotide at the end of the gap determines the
capability of Polμ to produce sequence expansions. For
this, two new DNA substrates containing 6-nt gaps were
chosen: the first, with the template sequence 3'-(CAG)2,
has no repeated nucleotides, and the second, 3'-(GAA)2,
maintains the presence of a duplicated nucleotide at
the end of the gap (Figure 2). We compared the behaviour
of Polμ and Polλ on these substrates, providing the complementary
nucleotides needed in each case. As shown in
Figure 2, in both cases Polλ filled the 6-nt gap and then
polymerization stopped. Conversely, Polμ was able to
continue polymerization beyond the expected +6
product in both cases, but produced a large sequence expansion
only when confronting a repeated nucleotide
(AA) at the end of the gap (Figure 2). Thus, this iteration
is a relevant requirement for Polμ in order to produce
promiscuous elongation.

Next we addressed whether the repetition of the trinucleotide
was specifically needed for Polμ to produce the expansion.
For that, Polμ was assayed on a DNA substrate having
only one copy of the triplet sequence, forming a 3-nt
gap containing the sequence 3'-GAA, which maintains the
requirement of a repeated nucleotide at the end of the gap.
As shown in Figure 2, when complementary nucleotides
(dC and dT) were provided, Polμ again generated a large
sequence expansion, demonstrating that the dinucleotide
is required only once at the end of the template strand.

An initial dislocation and remaining distortion are
necessary for the generation of large sequence
expansions by Polμ

[Figure 1 scheme: template 3'-CTAGTGTCACTCATG(NNN)2TGCTGCCGGTCA-5'; primer 5'-GATCACAGTGAGTAC-3';
downstream strand 5'-P-ACGACGGCCAGT-3'. Gap sequences tested: 3'-(GAC)2, 3'-(GTC)2, 3'-(AAG)2 and 3'-(CTT)2.]

Figure 1. Polμ generates a large sequence expansion on a DNA gap
with a specific trinucleotide sequence. In the scheme, the template
sequence indicated (NNN) corresponds to the trinucleotides shown
below. Subindex '2' indicates that each trinucleotide sequence is
repeated twice in the gap. Polymerization reactions (described in 'Materials
and Methods' section) were performed in the presence of 4 nM of the
indicated DNA substrate in each case, 2 mM MgCl2, 270 nM of each
polymerase and 50 µM mix of dNTPs (using the nucleotides complementary
to the gap sequence in each case). After 1 h incubation at 30°C,
polymerization products were analysed by electrophoresis on 20%
polyacrylamide/8 M urea gels and autoradiography. P indicates the
unextended primer; +6 the elongated product upon complete
gap-filling; +18 the fully elongated product after gap-filling and
strand displacement; the products of nucleotide expansion are indicated
with a bracket.

By using additional sequence contexts, Polμ was confirmed
to be able to produce large sequence expansions
of the four different duplicated nucleotides (AA, TT,
CC and GG) at the end of a 3-nt gap (Figure 3A). Polλ
was also used as a negative control of expansion in this experiment.
Interestingly, on the 3'-CCC gapped substrate
Polμ behaved differently than on the other
three DNA substrates: no large sequence expansion was
produced, and only a few more additions were observed
once the gap had been filled. Strikingly, the behaviour
appears to be the opposite in the case of Polλ, which
produced some significant expansion only when copying
the sequence 3'-CCC, perhaps the most prone to facilitate
slippage after gap-filling. Thus, the modest outcome of the
reaction on the 3'-CCC substrate indicates a new specific
feature needed for the generation of large expansions by
Polμ: the nucleotide at the first (n + 1) template position
must differ from that forming the repetition in order to
obtain the maximal sequence expansion. This requirement
for the trinucleotide sequence suggests that the expansion
reaction begins with an initial dislocation reaction, a
very particular mechanism described for Polμ (8,9),
producing a -1 frameshift during the initial step of the
gap-filling reaction: Polμ's propensity to dislocate
the template strand is such that it preferentially inserts the
nucleotide complementary to the n + 2 position of the
template, both in an open template/primer and in a 2-nt
gap (Supplementary Figure S1), either under conditions
that favour a mechanism of primer slippage
(Supplementary Figure S1A) or when this is not possible
due to the sequence context (Supplementary Figure S1B).
This second mechanism, 'dNTP-selection-mediated', is
not as efficient as the 'slippage-mediated' alternative, but
is considerably improved when the substrate is a small
5'-P-containing gap (Supplementary Figure S1C).
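
The two sequence requirements described so far for wild-type Polμ during gap-filling (a repeated nucleotide at the end of the gap, preceded by a different nucleotide) can be summarized as a simple rule. The Python sketch below is only a heuristic restatement of those observations, not a mechanistic model: the function name is hypothetical, and applying the rule to 6-nt gaps by inspecting only the last three template positions is an assumption made here for illustration.

```python
# Heuristic restatement of the observed sequence requirements
# (wild-type enzyme, gap-filling substrates only; illustrative).

def predicts_large_expansion(gap_3_to_5: str) -> bool:
    """Return True if the gap (read 3'->5') ends in a homodinucleotide
    that is preceded by a different nucleotide."""
    gap = gap_3_to_5.upper()
    if len(gap) < 3:
        return False
    repeat = gap[-1]
    has_dinucleotide = gap[-2] == repeat   # e.g. ...TT at the end of the gap
    preceding_differs = gap[-3] != repeat  # excludes 3'-CCC-like gaps
    return has_dinucleotide and preceding_differs


for gap in ("GACGAC", "GTCGTC", "AAGAAG", "CTTCTT",
            "CAGCAG", "GAAGAA", "GAA", "CAA", "CTT", "CCC", "CGG"):
    print(gap, predicts_large_expansion(gap))
```

With this rule, 3'-(CTT)2, 3'-(GAA)2, 3'-GAA, 3'-CAA, 3'-CTT and 3'-CGG are flagged as expansion-prone, whereas 3'-(GAC)2, 3'-(GTC)2, 3'-(AAG)2, 3'-(CAG)2 and 3'-CCC are not, in line with the results described above.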

[Figure 2 scheme: template 3'-...TCATG(NNN)TGCTGCCGGTCA-5'; downstream strand 5'-P-ACGACGGCCAGT-3'.
Gap sequences tested: 3'-(CAG)2, 3'-(GAA)2 and 3'-(GAA)1.]

Figure 2. A repeated nucleotide at the end of a gap is required to
generate large sequence expansions. In the scheme, the template
sequence indicated (NNN) corresponds to the trinucleotides shown
below, which could be present twice (subindex 2) or only once
(subindex 1). Polymerization reactions (described in 'Materials and
Methods' section) were performed in the presence of 4 nM of the
indicated DNA substrate in each case, 2 mM MgCl2, 270 nM of
either Polλ or Polμ and 50 µM mix of dNTPs (using the nucleotides
complementary to the gap sequence in each case). After 1 h incubation at
30°C, polymerization products, including sequence expansions, were
analysed by electrophoresis on 20% polyacrylamide/8 M urea gels
and autoradiography.

[Figure 3 schemes: (A) template 3'-CTAGTGTCACTCATG(CNN)TGCTGCCGGTCA-5'; gap sequences 3'-CAA, 3'-CTT,
3'-CCC and 3'-CGG; dNTPs provided: T/G, A/G, G and C/G, respectively. (B) template
3'-CTAGTGTCACTCATG(NAA)TGCTGCCGGTCA-5'; gap sequences 3'-AAA, 3'-TAA, 3'-GAA and 3'-CAA; dNTPs provided:
T for 3'-AAA; A/T, A or T for 3'-TAA; C/T, C or T for 3'-GAA; G/T, G or T for 3'-CAA.
In both panels the downstream strand is 5'-P-ACGACGGCCAGT-3'.]

Figure 3. A distortion upstream of the dinucleotide is a prerequisite for the formation of Polμ-mediated
sequence expansions during gap-filling. (A) In the scheme, the template sequence indicated (CNN)
corresponds to the four trinucleotides shown below. The first base of the trinucleotide is always dC,
followed by the four different homo-dinucleotides. Polymerization reactions (described in 'Materials and
Methods' section) were performed in the presence of 4 nM of the indicated DNA substrate in each case,
2 mM MgCl2, 270 nM of either Polλ or Polμ and 50 µM mix of dNTPs (using the nucleotides complementary to
the gap sequence in each case). (B) Importance of the first nucleotide of the triplet (NAA) for the
efficiency of expansion by Polμ. The gap sequence used and the different combinations of nucleotides
provided were as indicated. Polymerization reactions were performed as in (A). After 1 h incubation at
30°C, polymerization products were analysed by electrophoresis on 20% polyacrylamide/8 M urea
gels and autoradiography.

To further understand the requirements for the special
mechanism of dinucleotide expansion by Polμ, we
evaluated both the impact of the sequence at the first
position of the gap and the need to provide
either the two nucleotides complementary to the sequence of the
gap, or only the one complementary to the dinucleotide
(the only one needed if dislocation occurs). We used four
different 3-nt gapped DNA substrates (Figure 3B), with
the same terminal repetition (AA) but a different nucleotide
at the n + 1 position of the gap. As shown in Figure 3B,
a large expansion is obtained in the case of the 3'-GAA
substrate, and no difference is observed when providing
either dC + dT or only dT, indicating that in most
cases the first templating base (dG) is not copied.
Such a preferential dislocation is compatible with
a slippage-mediated mechanism, since the 3'-terminus of
the primer strand (dC) could be realigned and matched to
the first templating base (dG) [(40); see also
Supplementary Figure S1A]. Moreover, large expansions
on the gap sequences 3'-TAA and 3'-CAA are also
obtained by providing only dTTP. In these two cases,
the dTTP insertion event triggering expansion would be
compatible with a dNTP-selection-mediated dislocation
[(8); see Supplementary Figure S1B]. This mechanism
dominates insertion in the 3'-TAA gap, as the addition
of dATP + dTTP has minimal, if any, effect on expansion.
Conversely, expansions on the 3'-CAA gap are significantly
inhibited by the simultaneous addition of dGTP
and dTTP. These results point to the efficiency of dislocation
as an important factor determining the expansion
capability, which could also be affected by an unbalanced
concentration of the deoxynucleotide precursors.
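
The nucleotide combinations compared in this experiment follow from simple base-pairing bookkeeping: copying the whole 3-nt gap needs the complements of all three template bases, whereas after a dislocation that skips the first (n + 1) position only the complement of the dinucleotide is needed. The Python sketch below illustrates this; the function names are hypothetical and the code makes no claim about reaction efficiencies.

```python
# Illustrative bookkeeping of the dNTPs required with or without dislocation.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def dntps_for_full_copy(gap_3_to_5: str) -> list:
    """dNTPs needed to copy every template position of the gap."""
    return sorted({COMPLEMENT[base] for base in gap_3_to_5})


def dntps_after_dislocation(gap_3_to_5: str) -> list:
    """dNTPs needed if the first (n + 1) template base is not copied."""
    return sorted({COMPLEMENT[base] for base in gap_3_to_5[1:]})


for gap in ("AAA", "TAA", "GAA", "CAA"):
    print(gap, dntps_for_full_copy(gap), dntps_after_dislocation(gap))
```

For all four NAA gaps the dislocated pathway requires only dT, which is why expansion in the presence of dTTP alone is taken as evidence that the first templating base is skipped.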

In agreement with our previous observations using the
3'-CCC substrate (Figure 3A), Polμ did not produce a significant
expansion on the 3'-AAA substrate. Assuming
that an initial dislocation event can also occur, an important
difference in these two cases is that, after complete
gap-filling, the nascent chain could be completely
realigned, producing a perfectly matched +3 product. In
this case, expansion seems to be precluded, as evidenced
by the minimal extension of the primer beyond the size of
the gap. Therefore, an initial dislocation of the template
only triggers dinucleotide expansion if the distortion
remains after gap-filling. Further analysis showed that
the distortion needed to generate the
observed sequence expansions might be present in the substrate
prior to the arrival of the polymerase, triggering
dinucleotide expansions even at single nicks in DNA
(Supplementary Figure S2). As will be evaluated later
in this section, this is particularly relevant considering the
in vivo situations where Polμ deals with substrates containing
distortions and/or misalignments, caused by
microhomology search during the bridging of two DNA
ends in NHEJ reactions. The presence of a 5'-P group at
the downstream strand of the gap is also a prerequisite for
the generation of these nucleotide expansions, as shown in
Supplementary Figure S3. This observation was expected
since the 5'-P is a main anchor point for Polμ on the DNA
substrate (38).

Specific residues regulating the expansion of 
repeated sequences 

Polμ is an exceptional enzyme since it is the only DNA
polymerase able to display both template-independent
(terminal transferase) and template-dependent activities
(4,5). Recent structure-function studies have shed light
on the molecular basis for the terminal transferase
activity. On the one hand, Polμ contains a flexible element,
Loop 1, which is able to undergo conformational changes,
acting as a pseudo-template that allows incorporation of
nucleotides in the absence of template information (41), or
during NHEJ of some incompatible ends (15). Moreover,
the crystal structure of Polμ in ternary complex with gapped
DNA and deoxynucleotide (42) made it possible to infer that a
specific histidine residue (His329 in Polμ and His342 in
TdT; absent in Polβ and Polλ) could play an important
role in overcoming the rate-limiting step of untemplated
polymerization, allowing terminal transferase activity
to occur. In fact, this residue, involved in the proper positioning
of the primer terminus and the incoming nucleotide
(Figure 4A), was shown to be critical during template-independent
polymerization but also in template-directed
reactions associated with NHEJ of short incompatible ends
(10,42). Furthermore, Arg387 is a specific DNA-binding
ligand (Figure 4A) that limits untemplated nucleotide
additions and is thus responsible for the lower terminal
transferase activity of Polμ in comparison to TdT (10).

Figure 4. His329 allows while Arg387 limits sequence expansions by Polμ. (A) Top: Model of Polμ bound to
a template/primer substrate in which the 3'-protruding primer terminus is in an unproductive position,
occupying the incoming dNTP site. Residue Arg387, in a conformation modelled to match that of the lysine
present in TdT in a similar structure, is contacting the primer, impeding its backwards translocation.
His329, modelled in the conformation observed for the same residue in the crystal of TdT bound to an
ssDNA primer, is not making any contacts with the DNA substrate. Bottom: Crystal structure of Polμ bound
to a gapped DNA substrate and incoming dNTP. His329 has rotated and is contacting both incoming dNTP
and primer terminus, helping to reposition the latter. Arg387 is now contacting the template strand
(n - 3 position; indicated in yellow), having allowed the movement backwards of the primer. DNA
substrates are indicated in dark (primer strand) and light (template and downstream strand) blue.
Incoming dNTP is indicated in green. (B) In the scheme, the template sequence indicated (NAA)
corresponds to the four trinucleotides shown below. The last two bases always form the dinucleotide AA,
preceded by any of the four different nucleotides. Polymerization reactions (described in 'Materials and
Methods' section) were performed in the presence of 4 nM of the indicated DNA substrate in each case,
2 mM MgCl2, 270 nM Polμ and 50 µM mix of dNTPs (using the nucleotide complementary to the dinucleotide
AA). After 1 h incubation at 30°C, polymerization products were analysed by electrophoresis on 20%
polyacrylamide/8 M urea gels and autoradiography.

To study the relevance of both residues in the expansion
of repeated sequences we used two mutants: H329G (with
a strongly reduced terminal transferase activity) and
R387K (displaying an augmented terminal transferase
activity). The selected mutants, obtained and purified as
described (10), were tested on the set of gapped DNA
substrates that maintain the same repetition (AA) at the
5'-end, but differ in the nucleotide at the n + 1 position in
the gap. As shown in Figure 4B, the wild-type Polμ
produced an expansion pattern similar to that described
before (Figure 3B). Strikingly, mutant H329G was only
able to fill the different gaps, displaying very minor or
even undetectable levels of sequence expansions on these
substrates. Mutant R387K, on the other hand, produced a
level of expansions similar in most cases to that of the
wild-type enzyme (Figure 4B), but, remarkably, was also
able to produce a significant expansion in the case of the
3'-AAA substrate. This was striking, as the latter substrate
does not allow formation of the distortion that was an
obligatory requirement for producing expansions by the
wild-type enzyme. These results support our initial
hypothesis that Arg387 has a constitutive role in preventing
slippage of the primer in each round of the catalytic cycle,
through a direct contact between this residue and the -2
position of the primer strand that can be observed in the
ternary complex of Polμ (Figure 4A). Mutation of this
arginine to lysine favours expansion in the absence of distortions
through a mechanism that facilitates the backwards
translocation of the primer strand. In the case of
distortion-mediated expansions, where the primer strand
is not properly oriented, Arg387 cannot exert its 'braking'
role. Mutation of arginine to lysine in this context does
not increase the sequence expansions any further since the
control mechanism is already compromised.

Impact of the sequence expansions during 
end-joining reactions 

As noted before, dinucleotide expansion could be
generated by Polμ as a by-product of its physiological
role of repairing DSBs by the NHEJ pathway. We
wanted to check whether the sequence expansion
capacity exhibited by Polμ on gapped DNA
substrates is also displayed during NHEJ reactions
that include the formation of short gaps, and whether the
requirements detected during gap-filling (i.e. the formation
of a distortion upstream of the polymerization point, and
the presence of a dinucleotide in the template strand) also
need to be met during end-joining of two DNA ends.

For this we used a tailored set of 3'-protruding NHEJ
substrates whose protrusions provide, once bridged by the
polymerase, a microhomology of 3 base pairs, the
flipping-out of a nucleotide at the -1 position of the
primer strand, and the formation of two 1-nt gaps, one on
each side of the connection (see scheme in Figure 5A).
Radioactive labelling of one of the 3'-protruding strands
allows detection of nucleotide incorporation on this end,
using the template information provided in trans by the
other 3'-protrusion, which contains a 5'-P group and will
thus be used as a template/downstream structure
(through binding of the phosphate by the 8 kDa domain
of Polμ). In order to evaluate whether the reaction is
trans-directed, the templating base (X) on this second
DNA substrate was changed to A, C, G or T (Figure 5).
Our results showed that Polμ is able to perform an efficient
and mostly template-directed trans-polymerization on this
kind of NHEJ substrate, since the polymerase
preferentially incorporated the nucleotide complementary
to the templating base (white asterisks and box in
Figure 5A). As expected, when the templating base is a
G and thus the template strand contains a dinucleotide
(GG), large sequence expansions were produced when
providing the nucleotide (dCTP) complementary to the
dinucleotide. The reiterative additions of dGTP that are
also catalysed by Polμ in every case can be considered
pure terminal transferase incorporations, since control experiments
in which the template-providing end is not
present also showed this outcome (Supplementary
Figure S4). This is in accordance with the preference for
nucleotide incorporation displayed by Polμ during
untemplated additions (41).

Strikingly, in the context of NHEJ we were able to 
observe the formation of large sequence expansions even 
in the absence of a provided distortion upstream to the 
polymerization site (Figure 5B), thus limiting the require- 
ments of this reaction only to the presence of a dinucleo- 
tide, a prerequisite that still needs to be met in this context. 
The amount of nucleotide that Polu requires to produce 
these expansions is low (20 uM), indicating that this 
process is highly efficient (Supplementary Figure S5A 
and B). Again, dGTP is being inserted as a result of 
pure terminal transferase additions, as demonstrated by 
a control experiment in which the template-providing 
end is not present (Supplementary Figure S4). We also 
detected expansion during NHEJ when the dinucleotide 
was formed by a pair of adenines (Supplementary Figure 
S5E and F), indicating that this mechanism does not depend 
on the identity of the repeated base. Taken together, our results 
indicate that if the sequence context is favourable to the 
expansions (i.e., iteration of nucleotides at the template 
strand), the polymerase itself may generate the required 
upstream distortion by adjusting the bridging of the two 
ends, a scenario that underscores the 
mutagenic potential of Polμ during the NHEJ pathway, 
specifically regarding nucleotide expansions. 



DISCUSSION 

Indels (insertions and deletions) are common errors 
produced during DNA replication and repair, and are related to 
different human pathologies including cancer and 
diseases associated with expansion of repeats. All polymerases 
studied to date generate indels during DNA synthesis 
in vitro (43), but with very different frequencies. Thus, 
although X family members Polβ, Polλ and Polμ all 
generate single-base deletions during synthesis (9,21,22), 
Polλ has a much higher deletion rate, whose structural 
basis has been proposed (44). The first hypothesis to explain 
the production of indels was introduced by 
Streisinger et al. (40): these frameshift mutations were 
described as products of strand slippage in repetitive 
DNA sequences. Two other models have been proposed 
since then, namely 'direct misincorporation misalignment', 
in which a polymerase introduces an initial 
mismatch that causes the primer terminus to be subsequently 
realigned (44,45), and 'dNTP-stabilized misalignment', 
in which incorporation of the correct dNTP occurs 
opposite a complementary downstream template base 
(46,47). 

The results presented here demonstrate that human 
Polμ can catalyse large nucleotide expansions when 
copying a repeated templating base in the vicinity of a 
5'-P. Based on our findings with different sequence 
contexts, we propose a specific model for Polμ-mediated 
generation of the expansions during gap-filling that 
requires: (i) initial dislocation of the template strand; 
(ii) generation of a mismatch/distortion that will trigger 
nucleotide expansion (Figure 6B and C). Initial template 
dislocation can be either facilitated by slippage, when the 
primer terminus is complementary to the first nucleotide at 
the gap (Figure 6B), or stabilized by the incoming nucleotide 
(dNTP-mediated; Figure 6C); in both cases, after 
initial dislocation (stage 1), the gap is filled and a 
mismatch is left behind (stage 2), and then expansion 
occurs (stage 3). However, such expansions are not 
simply the result of Streisinger's 'strand slippage': if the 
sequence to be copied is formed by three identical nucleotides 
(Figure 6A), expansion hardly happens because 
the mismatch/distortion that could trigger further nucleotide 
incorporation beyond gap-filling cannot 
occur. 

Figure 5. Impact of the generation of sequence expansions during NHEJ repair reactions by Polμ. (A) The scheme corresponds to the end-joining 
substrates used, whose 3'-protrusions can be connected by three base pairs but leaving a distortion (1 flipped-out base) close to the 3'-primer 
terminus. Such a connection leaves two different 1-nt gaps. Gap-filling of one of them (that flanked by a 5'-P) is evaluated as a function of each 
possible templating base (X). Thus, the 5'-labelled substrate (dark grey) will be tested as primer, whereas the cold substrate (light grey; in which the 
X in the scheme is changed to A, C, G or T) provides the template for the connection. Polymerization reactions were performed in the presence of 
200 nM Polμ, 2.5 mM MgCl2 and 100 µM of a single dNTP (complementary to X in each case). After incubation for 1 h at 30°C, reactions were 
stopped and loaded on 20% PA-8M urea gels. Labelled DNA fragments were detected by autoradiography. (B) Polymerization reactions performed 
as in (A), but using end-joining substrates whose 3'-protrusions can be connected by three base pairs with no distortion. 



Such gaps, possibly containing distortions, are 
common substrates at the second step of NHEJ, as they 
are generated after repairing the first strand of the DSB. 
Moreover, our results indicate that our model for the production 
of expansions of iterative dinucleotides is also 
valid during the first step of NHEJ (Figure 6F), where 
the critical requisite of generating a distortion upstream 
of the primer terminus is expected to be met during 
end-bridging and search for microhomology. 

Relationship between terminal transferase and the 
generation of sequence expansions 

In most DNA-dependent DNA polymerases, proper positioning 
of the 3' terminus is indirectly dictated by the 
enzyme's avidity for the templating base, thus configuring 
a binary complex ready to select the incoming nucleotide 
(ternary complex). When no template base is 
available (blunt or 3'-protruding ends, or when a gap has 
been filled), any further nucleotide addition is 
unfavoured, due to deficient translocation of the 
3' terminus, thus precluding addition of extra nucleotides. 
Template instruction is a general feature of most members 
of the X family, with the exception of TdT. Interestingly, 
Polμ shows hybrid biochemical properties: it is strongly 
activated by a template DNA chain (4), but it has an intrinsic 
terminal transferase activity, although weaker than that of 
TdT. A specific histidine residue, conserved between Polμ 
(His329) and TdT (His342), but absent in Polβ (Gly189) or 
Polλ (Gly426), confers terminal transferase activity, as it is 
crucial for proper positioning of the 
primer terminus and the incoming nucleotide in the 
absence of a template (42). Mutating this histidine in 
Polμ (10,42) and TdT (48) substantially reduced 
template-independent activity. As shown here, elimination 
of His329 rendered Polμ unable to perform sequence expansions 
in the context of a gap. Therefore, the specific 
role of His329 during Polμ's catalytic cycle, facilitating 
primer translocation during both templated and non-templated 
nucleotide insertion (terminal transferase), has 
two sides: it is beneficial for NHEJ of incompatible ends 
but allows, as a collateral effect, the possible generation 
of large sequence expansions through the very same mechanism 
of favoured primer translocation. In addition, Polμ's 
terminal transferase activity is negatively regulated by 
Arg387 (10), which acts as a brake on the necessary 
movement of the primer (thus counteracting His329), to 
limit excessive nucleotide additions before end-bridging. 
The role of this residue in a non-distorted gap would be 
positive in terms of genome stability, because a braked 
primer translocation would help to reduce the number 
of extra nucleotide units added beyond gap-filling 
(Figure 6A). However, in a physiological context of 
DSB repair, where NHEJ produces gaps with possible 
distortions (shown here to be a requisite for expansion), 
Arg387 might not be able to maintain the contacts with the 
primer, allowing un-braked translocation and facilitating 
nucleotide expansion. In summary, our site-directed 
mutagenesis results support the idea that the same mechanism 
that provides Polμ with the ability to perform untemplated 
insertion of nucleotides (terminal transferase), beneficial 
for NHEJ of non-complementary ends, has an unexpected 
downside, since it also allows Polμ to generate large 
sequence expansions in the context of repair reactions. 

Figure 6. Mechanistic model for dinucleotide expansions generated by human Polμ. Thick arrows indicate more efficient reactions, whereas thin 
arrows are related to slower or less favourable reactions. (A) Trinucleotide formed by the same nucleotide. In this case, no distortion associated with 
gap-filling is generated, thus precluding a large expansion. (B) After polymerase binding and realignment of the primer terminus, a 
'slippage-mediated' dislocation is formed, creating a template distortion. In this situation, a large expansion reaction is observed. (C) In this case 
the distortion is induced by a 'dNTP-selection-mediated' dislocation of the template strand, again resulting in the generation of large sequence 
expansions. (D) In a NHEJ context, a repeated nucleotide neighbouring the 5'-P can induce nucleotide expansions by Polμ, although in this case, a 
pre-existing stable distortion or mispairing is not strictly required. 

Polμ, a candidate to generate mono- and dinucleotide 
expansions in vivo 

The genome of most organisms thus far examined 
contains many tracts of repetitive DNA called microsatellites. 
The discovery that a number of human diseases are 
the direct consequence of mutations within such repeats 
has triggered considerable interest in the mechanisms that 
change the number of copies of repeated DNA sequences. 
DNA expansions in mono- and dinucleotide repeats are 
more likely to be deleterious to the cell by causing not only 
addition mutations but also frameshift mutations. Current 
models of DNA repeat instability involve DNA polymerase 
slippage at these iterative tracts, which are normally 
associated with mutation hot-spots. Which polymerases 
are responsible for expansions? Initial efforts were 
aimed at measuring DNA polymerase-catalysed 'reiterative 
replication' of repeat sequences with replicative polymerases 
(34,49-52), but evidence soon indicated that those 
lacking proofreading and also strand displacement 
capabilities are better candidates (53-55). Moreover, 
other results indicate that sequence expansions could be 
linked to DNA damage (32), a process causally related 
to ageing (56,57). That could be a vicious cycle, as it is 
quite possible that repetitive DNA is a better target for 
DNA damage than normal DNA. Repair substrates such as 
short gaps are produced in vivo during base excision 
repair, as well as during the final steps of nucleotide 
excision repair and post-replication MMR. More specifically, 
DNA polymerases such as Polμ are able to configure 
gap-like substrates during NHEJ. In all these substrates, 
the presence of a downstream strand would lead to the 
'stalling' of the polymerization reaction (54), thus 
providing sufficient time for realigning the primer strand 
and triggering expansive nucleotide insertion. 

Our results suggest that Polμ can generate the expansion 
of mononucleotide tracts during these repair reactions 
in vivo, given the appropriate sequence context. It 
is not yet known whether homopolymeric runs are more 
prone to DSBs than non-iterative sequences, but the possibility 
can be easily envisaged, due to the ssDNA-containing 
secondary structures that these sequences can 
adopt. As derived from our work, in the case of a DSB 
occurring at a site containing an iterative sequence as 
short as a single dinucleotide, a Polμ-mediated 
repair reaction would result in the generation 
of frameshifts and sequence expansions. This could in 
turn lead to an increased risk of microsatellite and 
genome instability, events that have been related to 
cancer and other illnesses such as neurodegenerative 
syndromes. 
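
As a purely illustrative aid, not part of the methods of this study, the sequence context discussed above (iteration of the same nucleotide, starting from a single dinucleotide such as GG or AA) can be located in any sequence with a short scan. The sketch below is a hypothetical helper: the function name, the `min_len` parameter and the example sequence are assumptions introduced here, and the code only lists runs of identical bases. It does not predict whether Polμ would actually expand them.

```python
from itertools import groupby


def find_homopolymer_runs(seq, min_len=2):
    """Return (start, base, length) for every run of identical bases.

    min_len=2 reports runs as short as a single dinucleotide (e.g. GG).
    Positions are 0-based. This is a toy scan, not a model of the enzyme.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


if __name__ == "__main__":
    # Hypothetical sequence, not one of the substrates used in the study.
    for start, base, length in find_homopolymer_runs("ACGGTAAACT"):
        print(start, base, length)
```

Raising `min_len` restricts the output to longer homopolymeric tracts, which may be useful when only microsatellite-like runs are of interest.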